<RH.ANOH.S ...>We approached this challenge by introducing an
<RH.ANOH CONCEPT="Intelligent Agents" SUBCONCEPT="intelligent agent" SENTENCE="4" NUMBER=1>intelligent agent</RH.ANOH> that analyzes interactions
between user and <RH.ANOH CONCEPT="Bayes Inference" SUBCONCEPT="expert system" SENTENCE="4" ...>expert system</RH.ANOH> and automatically constructs
database queries based on this analysis</RH.ANOH.S>. The user is unobtrusively
notified when information relevant to the current diagnostic context has been
returned, and may immediately access it if desired. From the user's perspective
all database machinery is entirely transparent; indeed no formal query
language is even made available. Hence we term this approach query-free
information retrieval. <p>

<RH.ANOH.S ...>As we hope will be apparent from what follows, the introduction of the
<RH.ANOH CONCEPT="Intelligent Agents" SUBCONCEPT="intelligent agent" SENTENCE="5" NUMBER=2>intelligent agent</RH.ANOH> additionally offers one
solution to a fundamental problem facing designers of cooperative information
systems: How can legacy systems of substantial complexity be integrated within
a larger system context</RH.ANOH.S>? By requiring that all interactions with
the legacy database be mediated by the agent, we have been able to isolate the
database system cleanly while still supporting query-free information
retrieval. <p>

<RH.ANOH.S ...>[...] comprised of the three subsystems already mentioned: the probabilistic
<RH.ANOH CONCEPT="Bayes Inference" SUBCONCEPT="expert system" SENTENCE="6" NUMBER=4>expert system</RH.ANOH>, the legacy full-text database system (to
which we added a new, semantically-based indexing structure that supports limited
<RH.ANOH CONCEPT="Natural Language" SUBCONCEPT="natural language" SENTENCE="6" NUMBER=1>natural language</RH.ANOH> queries), and the
<RH.ANOH CONCEPT="Intelligent Agents" SUBCONCEPT="intelligent agent" SENTENCE="6" NUMBER=3>intelligent agent</RH.ANOH> that effectively integrates
them</RH.ANOH.S>. The following sections describe these system components, provide
implementation details, illustrate the runtime behavior of FIXIT, report
on operational experience, and close with some observations about query-free
information retrieval and the potential for generalizing the underlying paradigm.<p>

<h2> FIXIT's System Components</h2> 

We first describe the probabilistic expert sub-system and the information
retrieval sub-system. Before briefly describing these, we stress that our purpose
was not necessarily to advance the capabilities of the individual components
or indeed even to exploit fully the best current technology; instead, we
focus on their integration. <p>
<P> 
AUTOMATIC ADAPTIVE DOCUMENT HELP SYSTEM

The present invention relates to display of electronic documents and more
particularly to method and apparatus for augmenting electronic document display with
features to enhance the experience of reading an electronic document on a display.

Increasingly, readers of documents are being called upon to assimilate vast
quantities of information in a short period of time. To meet the demands placed upon
them, readers find they must read documents "horizontally," rather than "vertically," i.e.,
they must scan, skim, and browse sections of interest in multiple documents rather than
read and analyze a single document from beginning to end.

Documents are now more and more available in electronic form. Some
documents are available electronically by virtue of their having been locally created using
word processing software. Other electronic documents are accessible via the Internet. Yet
others may become available in electronic form by virtue of being scanned in, copied, or
faxed. See commonly assigned U.S. Application No. 08/754,721, entitled AUTOMATIC
AND TRANSPARENT DOCUMENT ARCHIVING, the contents of which are herein
incorporated by reference.

However, the mere availability of documents in electronic form does not
assist the reader in confronting the challenges of assimilating information quickly. Indeed,
many time-challenged readers still prefer paper documents because of their portability and
the ease of flipping through pages.

Certain tools exist to take advantage of the electronic form of documents to
assist harried readers. Tools exist to search for documents both on the Internet and
locally. However, once the document is identified and retrieved, further search
capabilities are limited to keyword searching. Automatic summarization techniques have
also been developed but have limitations in that they are not personalized. They
summarize based on general features found in sentences.



What is needed is a document display system that helps the reader find as
well as assimilate the information he or she wants more quickly. The document display
system should be easily personalizable and flexible as well.



An automatic reading assistance application for documents in electronic
form is provided by virtue of the present invention. In certain embodiments, an automatic
annotator is provided which finds concepts of interest and keywords. The operation of the
annotator is personalizable for a particular user. The annotator is also capable of
improving its performance over time by both automatic and manual feedback. The
annotator is usable with any electronic document. Another available feature is an elongated
thumbnail image of all or part of a multipage document wherein a currently displayed
section of the document is emphasized in the elongated thumbnail image. Movement of the
emphasized area in the elongated thumbnail image is then synchronized with scrolling
through the document.

In accordance with a first aspect of the present invention, a method for
annotating an electronically stored document includes steps of: accepting user input
indicating user-specific concepts of interest, analyzing the electronic document to identify
locations of discussion of the user-specific concepts of interest, and displaying the
electronic document with visual indications of the identified locations.

In accordance with a second aspect of the present invention, a method for
displaying a multi-page document includes steps of: displaying an elongated thumbnail
image of a multi-page document in a first viewing area of a display, displaying a section of
the multi-page document in a second viewing area of the display in legible form,
emphasizing an area of the elongated thumbnail image corresponding to the section
displayed in the second viewing area, accepting user input controlling sliding of the
emphasized area through the thumbnail image, and scrolling the displayed section through
the second viewing area responsive to the sliding so that the emphasized area continues
to correspond to the displayed section.

A further understanding of the nature and advantages of the inventions
herein may be realized by reference to the remaining portions of the specification and the
attached drawings, in which:



Fig. 1 depicts a representative computer system suitable for implementing
the present invention.

Figs. 2A-2D depict document browsing displays in accordance with one
embodiment of the present invention.

Fig. 3 depicts a document summary view in accordance with one
embodiment of the present invention.

Fig. 4 depicts a table of contents view in accordance with one embodiment
of the present invention.

Fig. 5 depicts a top-level software architectural diagram for automatic
annotation in accordance with one embodiment of the present invention.

Figs. 6A-6C depict a detailed software architectural diagram for automatic
annotation in accordance with one embodiment of the present invention.

Fig. 7 depicts a representative Bayesian belief network useful in automatic
annotation in accordance with one embodiment of the present invention.

Fig. 8 depicts a user interface for defining a user profile in accordance with
one embodiment of the present invention.

Figs. 9A-9B depict an interface for providing user feedback in accordance
with one embodiment of the present invention.

Fig. 10 depicts a portion of an HTML document processed in accordance
with one embodiment of the present invention.

Computer System Usable for Implementing the Present Invention

Fig. 1 depicts a representative computer system suitable for implementing
the present invention. Fig. 1 shows basic subsystems of a computer system 10 suitable for
use with the present invention. In Fig. 1, computer system 10 includes a bus 12 which
interconnects major subsystems such as a central processor 14, a system memory 16, an
input/output controller 18, an external device such as a printer 20 via a parallel port 22, a
display screen 24 via a display adapter 26, a serial port 28, a keyboard 30, a fixed disk
drive 32 and a floppy disk drive 33 operative to receive a floppy disk 33A. Many other
devices may be connected such as a scanner 34 via I/O controller 18, a mouse 36
connected to serial port 28 or a network interface 40. Many other devices or subsystems
(not shown) may be connected in a similar manner. Also, it is not necessary for all of the
devices shown in Fig. 1 to be present to practice the present invention, as discussed below.
The devices and subsystems may be interconnected in different ways from that shown in
Fig. 1. The operation of a computer system such as that shown in Fig. 1 is readily
known in the art and is not discussed in detail in the present application. Source code to
implement the present invention may be operably disposed in system memory 16 or stored
on storage media such as a fixed disk 32 or a floppy disk 33A. Image information may be
stored on fixed disk 32.

Annotated Document User Interface

The present invention provides a personalizable system for automatically
annotating documents to locate concepts of interest to a particular user. Fig. 2A depicts
one user interface 200 for viewing a document that has been annotated in accordance with
the present invention. A first viewing area 202 shows a section of an electronic document.
Using a scroll bar 204, or in other ways, the user may scroll the displayed section through
the electronic document.

A series of concept check boxes 206 permit the user to select which
concepts of interest are to be noted in the document. A sensitivity control 208 permits the
user to select the degree of sensitivity to apply in identifying potential locations of relevant
discussion. At low sensitivity, more locations will be denoted as being relevant, even
though some may not be of any actual interest. At high sensitivity, most all denoted
locations will in fact be relevant but some other relevant locations may be missed. After
each concept name appearing by one of checkboxes 206 appears a percentage giving the
relevance of the currently viewed document to the concept. These relevance levels offer a
quick assessment of the relevance of the document to the selected concepts. Fig. 2A shows
no annotations because a plain text view rather than an annotated view has been selected
for first viewing area 202.

A thumbnail view 214 of the entire document is found in a second viewing
area 215. Thumbnail view 214 will be discussed in greater detail below.

Miscellaneous navigation tools are found on a navigation toolbar 216.
Miscellaneous annotation tools are found on an annotation toolbar 218. The annotation
tools on annotation toolbar 218 facilitate navigation through a collection of documents.

According to the present invention, annotations may be added to the text
displayed in first viewing area 202. The annotations denote text relevant to user-selected
concepts. As will be explained further below, an automatic annotation system according to
the present invention adds these annotations to any document available in electronic form.
The document need not include any special information to assist in locating discussion of
concepts of interest.

Fig. 2B depicts the document view of Fig. 2A but with annotation added in
first viewing area 202. Phrases 220 have been highlighted to indicate that they relate to
concepts of interest to the user. The highlighting is preferably color. However, for ease
of illustration in black-and-white format, rectangles indicate the highlighted areas of text.
For further emphasis, the highlighted text is preferably printed in bold. A rectangular bar
222 indicates a paragraph that has been determined to have relevance above a
predetermined threshold or to have more than a threshold number of key phrases.
Rectangular bar 222 is merely representative of various forms of marginal annotation that
might be used to indicate a relevant section of the text.
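
The rule for drawing a marginal bar can be illustrated with a minimal sketch. The function name `needs_margin_bar` and both default threshold values are hypothetical; the description says only that the thresholds are predetermined.

```python
def needs_margin_bar(relevance, key_phrase_count,
                     relevance_threshold=0.5, phrase_threshold=3):
    """Return True when a paragraph should receive a marginal bar.

    A paragraph qualifies when its relevance exceeds a threshold OR when it
    holds more than a threshold number of key phrases. Both default values
    are illustrative assumptions, not values taken from the description.
    """
    return (relevance > relevance_threshold
            or key_phrase_count > phrase_threshold)
```

Either condition alone is sufficient, mirroring the "or" in the paragraph above.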

Fig. 2C depicts an alternative style of annotation. Now in first viewing area
202, entire sentences 224 including phrases relevant to concepts of interest are highlighted.
The phrases themselves are printed in bold text. It has been found that highlighting the
entire sentence rather than just a relevant phrase provides the user with far more
information at a glance.

Fig. 2D depicts how further information about key phrases may be
displayed. The user may select a highlighted key phrase with the mouse. Upon
selection of the key phrase, a balloon 226 appears. The balloon includes further
information relevant to the key phrase. For example, the balloon may include the name of
the concept to which the keyword is relevant. The balloon may also include bibliographic
information if the key phrase includes a citation.

Fig. 3 depicts a document summary view in accordance with one
embodiment of the present invention. The user may optionally select a summary view 300
of the document. Summary view 300 lists the concepts of interest 302 that are found in the
document as headings of an outline. For each concept, keywords or key phrases 304 are
listed which are indicative of the concept. An indicator by each
keyword indicates the number of times the keyword or key phrase appears. Each concept
also has an associated score 306 indicative of the relevance of the whole document to the
concept.
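
The outline built for the summary view can be sketched as a count of key-phrase occurrences grouped by concept. This is a minimal illustration only: the `profile` mapping is a simplified, hypothetical stand-in for the user profile, and the per-concept score 306 is not computed here.

```python
import re
from collections import Counter


def summarize(text, profile):
    """Map each concept to the occurrence counts of its key phrases.

    `profile` maps a concept name to a list of key phrases. Concepts with
    no hits are left out, so the result lists only concepts found in text.
    """
    lowered = text.lower()
    summary = {}
    for concept, phrases in profile.items():
        counts = Counter()
        for phrase in phrases:
            # Whole-word, case-insensitive match of the key phrase.
            pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
            hits = len(re.findall(pattern, lowered))
            if hits:
                counts[phrase] = hits
        if counts:
            summary[concept] = dict(counts)
    return summary
```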

Fig. 4 depicts a table of contents view in accordance with one embodiment
of the present invention. An alternative to summary view 300 is a table of contents view
400. Table of contents view 400 lists major headings 402 and subheadings 403 of the
electronic document. By selecting one of hierarchical display icons 404, the user may list
the concepts 406 found under one of the document headings 402 or subheadings 403 with
an indication of relevance to each concept and the number of keywords found. There is
also a relevance meter 408 for each document heading 402 that indicates the overall
relevance of the text under that heading for all of the currently selected concepts. In a
preferred embodiment where the document is an HTML document, for table of
contents view 400, the headings of the document are identified by an analysis of the
HTML heading tags.
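
Identifying headings from HTML heading tags can be sketched with Python's standard `html.parser` module. The class and function names below are hypothetical, and the sketch returns only the heading level and text; relevance meters and concept lists are out of scope.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for <h1> through <h6> elements."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None   # level of the heading currently open, if any
        self._parts = []     # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._level is not None:
            self.headings.append((self._level, "".join(self._parts).strip()))
            self._level = None


def table_of_contents(html):
    """Return the document's headings in order of appearance."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return collector.headings
```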

Automatic Annotation Software

Fig. 5 depicts a top-level software architectural diagram for automatic
annotation in accordance with one embodiment of the present invention. A document 502
exists in electronic form. It may have been scanned in originally. It may be, e.g., in
HTML, Postscript, LaTeX, other word processing or e-mail formats, etc. The description
that follows assumes an HTML format. A user 504 accesses document 502 through a
document browser 506 and an annotation agent 508. Document browser 506 is preferably
a hypertext browsing program such as Netscape Navigator or [...] but also
may be, e.g., a conventional word processing program.

Annotation agent 508 adds the annotations to document 502 to prepare it for
viewing by document browser 506. Processing by annotation agent 508 may be
understood to be in three stages, a text processing stage 510, a content recognition stage
512, and a formatting stage 514. The input to text processing stage 510 is raw text. The
output from text processing stage 510 and input to content recognition stage 512 is a
parsed text stream, a text stream with formatting information such as special tags around
particular words or phrases removed. The output from content recognition stage 512 and
input to formatting stage 514 is an annotated text stream. The output of formatting stage
514 is a formatted text file viewable with document browser 506.
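
The three stages can be pictured as a chain of functions, each consuming the previous stage's output. The following is a minimal sketch under stated assumptions: the function names are hypothetical, tag stripping uses a simple regular expression rather than a real parser, matching is plain case-insensitive substring search rather than the belief system described later, and `<b>` stands in for the annotation markup.

```python
import re


def text_processing(raw_html):
    """Stage 510 (sketch): strip markup to obtain a parsed text stream."""
    return re.sub(r"<[^>]+>", "", raw_html)


def content_recognition(parsed_text, key_phrases):
    """Stage 512 (sketch): return sorted (start, end, phrase) spans of hits."""
    spans = []
    for phrase in key_phrases:
        for match in re.finditer(re.escape(phrase), parsed_text, re.IGNORECASE):
            spans.append((match.start(), match.end(), phrase))
    return sorted(spans)


def formatting(parsed_text, spans):
    """Stage 514 (sketch): wrap each located span in <b> tags for display."""
    out, cursor = [], 0
    for start, end, _ in spans:
        if start < cursor:  # skip a span that overlaps one already emitted
            continue
        out.append(parsed_text[cursor:start])
        out.append("<b>" + parsed_text[start:end] + "</b>")
        cursor = end
    out.append(parsed_text[cursor:])
    return "".join(out)


def annotate(raw_html, key_phrases):
    """Run the three stages in order on one document."""
    parsed = text_processing(raw_html)
    return formatting(parsed, content_recognition(parsed, key_phrases))
```

For instance, `annotate("<p>The <i>expert system</i> works.</p>", ["expert system"])` yields the text with the key phrase wrapped in `<b>` tags.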

The processing of annotation agent 508 is preferably a run-time process.
The annotations are preferably not pre-inserted into the text but are rather generated when
user 504 requests document 502 for browsing. Thus, this is preferably a dynamic process.
Annotation agent 508 may also, however, operate in the background as a batch process.

The annotation added by annotation agent 508 depends on concepts of
interest selected by user 504. User 504 also inputs information used by annotation agent
508 to identify locations of discussion of concepts of interest in document 502. In a
preferred embodiment, this information defines the structure of a Bayesian belief network.
The concepts of interest and other user-specific information are maintained in a user
profile file 516. User 504 employs a profile editor 518 to modify the contents of user
profile file 516.

Fig. 6A depicts the automatic annotation software architecture of Fig. 5
with text processing stage 510 shown in greater detail. Fig. 6A shows that the source of
document 502 may be accessed via a network 602. Possible sources include e.g., the
Internet 604, an intranet 606, a digital copier 608 that captures document images, or other
office equipment 610 such as a fax machine, scanner, printer, etc. Another alternative
source is the user's own hard drive 32.

Text processing stage 510 includes a file I/O stage 612, an updating stage
614, and a language processing stage 616. File I/O stage 612 reads the document file from
network 602. Updating stage 614 maintains a history of recently visited documents in a
history file 618. Language processing stage 616 parses the text of document 502 to
generate the parsed text output of text processing stage 510.

Fig. 6B depicts the automatic annotation software architecture of Fig. 5 with
content recognition stage 512 shown in greater detail. A pattern identification stage 620
looks for particular patterns in the parsed text output of text processing stage 510. The
particular patterns searched for are determined by the contents of user profile file 516.
Once the patterns are found, annotation tags are added to the parsed text by an annotation
tag addition stage 622 to indicate the pattern locations. In a preferred HTML embodiment,
these annotation tags are compatible with the HTML format. However, the tagging
process may be adapted to LaTeX, Postscript, etc. A profile updating stage 624 monitors
the output of annotation tag addition stage 622 and analyzes text surrounding the locations
of concepts of interest. As will be further discussed with reference to Fig. 7, profile
updating stage 624 changes the contents of user profile file 516 based on the analysis of
this surrounding text. The effect is to automatically refine the patterns searched for by
pattern identification stage 620 to improve annotation performance.

Fig. 6C depicts the automatic annotation software architecture of Fig. 5 with
formatting stage 514 shown in greater detail. Formatting stage 514 includes a text
rendering stage 626 that formats the annotated text provided by content recognition stage
512 to facilitate viewing by document browser 506. An HTML document as modified by
formatting stage 514 is discussed in greater detail with reference to Fig. 10.

Pattern identification stage 620 looks for keywords and key phrases of
interest and locates relevant discussion of concepts based on the located keywords. The
identification of keywords and the application of the keywords to locating relevant
discussion is preferably accomplished by reference to a belief system. The belief system is
preferably a Bayesian belief network.

Fig. 7 depicts a portion of a representative Bayesian belief network 700
implementing a belief system as used by pattern identification stage 620. A first oval 702
represents a particular user-specified concept of interest. Other ovals 704 represent
subconcepts related to the concept identified by oval 702. Each line between one of
subconcept ovals 704 and concept oval 702 indicates that discussion of the subconcept
implies discussion of the concept. Each connection between one of subconcept ovals 704
and concept oval 702 has an associated probability value indicated in percent. These
values in turn indicate the probability that the concept is discussed given the presence of
evidence indicating the presence of the subconcept. Discussion of the subconcept is in turn
indicated by one or more keywords or key phrases (not shown in Fig. 7).
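
How evidence from several subconcepts might be combined into one relevance value can be sketched as follows. The description does not state a combination rule, so the noisy-OR rule used here is purely an assumption of this sketch, as are the function name and the example probabilities.

```python
def concept_probability(link_probabilities, observed_subconcepts):
    """Estimate the probability that a concept is discussed.

    link_probabilities: subconcept -> P(concept discussed | subconcept present)
    observed_subconcepts: subconcepts whose keywords were found in the text

    ASSUMPTION: evidence is combined with a noisy-OR rule, i.e. the concept
    is treated as not discussed only if every observed subconcept
    independently fails to imply it.
    """
    not_discussed = 1.0
    for subconcept in observed_subconcepts:
        not_discussed *= 1.0 - link_probabilities.get(subconcept, 0.0)
    return 1.0 - not_discussed
```

With hypothetical links of 80% and 50%, observing both subconcepts gives 1 - (0.2 x 0.5) = 0.9 under this rule, and observing neither gives 0.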

The structure of Bayesian belief network 700 is only one possible structure
applicable to the present invention. For example, one could employ a Bayesian belief
network with more than two levels of hierarchy so that the presence of subconcepts is
suggested by the presence of "subsubconcepts" and so on. In the preferred embodiment,
presence of a keyword or key phrase always indicates presence of discussion of the
subconcept but it is also possible to configure the belief network so that presence of a
keyword or key phrase suggests discussion of the subconcept with a specified probability.



The primary source for the structure of Bayesian belief network 700
including the selection of concepts, keywords and key phrases, interconnections, and
probabilities is user profile file 516. In a preferred embodiment, user profile file 516 is
selectable for both editing and use from among profiles for many users.
The structure of belief system 700 is however also modifiable during use of
the annotation system. The modifications may occur automatically in the background or
may involve explicit user feedback input. The locations of concepts of interest determined
by pattern identification stage 620 are monitored by profile updating stage 624. Profile
updating stage 624 notes the proximity of other keywords and key phrases within each
analyzed document to the locations of concepts of interest. If particular keywords and key
phrases are always near a concept of interest, the structure and contents of belief system
700 are updated in the background without user input by profile updating stage 624. This
could mean changing probability values, introducing a new connection between a
subconcept and concept, or introducing a new keyword or key phrase.
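
One way to picture the background learning step is to collect the words that occur near a known key phrase and keep those that recur in every document containing it. This is only a sketch under assumptions: the five-word window, the word tokenization, the function names, and the strict "appears in every document" test are all illustrative choices, and no stop-word filtering is applied.

```python
import re
from collections import Counter


def nearby_words(text, key_phrase, window=5):
    """Count words within `window` words of each occurrence of key_phrase."""
    words = re.findall(r"[a-z']+", text.lower())
    target = key_phrase.lower().split()
    n = len(target)
    counts = Counter()
    for i in range(len(words) - n + 1):
        if words[i:i + n] == target:
            lo = max(0, i - window)
            hi = min(len(words), i + n + window)
            counts.update(words[lo:i] + words[i + n:hi])
    return counts


def candidate_keywords(documents, key_phrase, window=5):
    """Words found near key_phrase in every document that has neighbors."""
    per_doc = [set(nearby_words(doc, key_phrase, window)) for doc in documents]
    per_doc = [found for found in per_doc if found]
    return set.intersection(*per_doc) if per_doc else set()
```

A real profile updating stage would presumably filter common words and then adjust probabilities or add keywords; this sketch stops at proposing candidates.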

User 504 may select a word or phrase in document 502 as being relevant to
a particular concept even though the word or phrase has not yet been defined to be a
keyword or key phrase. Belief system 700 is then updated to include the new keyword or
key phrase. User 504 may also give feedback for an existing keyword or key phrase,
indicating the perceived relevance of the keyword or key phrase to the concept of interest.
If the selected keyword or key phrase is indicated to be of high relevance to the concept of
interest, the probability values connecting the subconcept indicated by the selected
keywords or key phrases to the concept of interest increase. If, on the other hand, user
504 indicates the selected keywords or key phrases to be of little interest, the probability
values connecting these keywords or key phrases to the concept decrease.


User Profile and Feedback Interfaces 

Fig. 8 depicts a user interface for defining a user profile in accordance with
one embodiment of the present invention. User interface screen 800 is provided by profile
editor 518. A profile name box 802 permits the user to enter the name of the person or
group to whom the profile to be edited is assigned. This permits the annotation system
according to the present invention to be personalized to particular users or groups. A
password box 804 provides security by requiring entry of a correct password prior to
profile editing operations.

A defined concepts list 806 lists all of the concepts which have already been
added to the user profile. By selecting a concept add button 808, the user may add a new
concept. By selecting a concept edit button 810, the user may modify the belief network as
it pertains to the listed concept that is currently selected. By selecting a remove button
812, the user may delete a concept.

If a concept has been selected for editing, its name appears in a concept
name box 813. The portion of the belief network pertaining to the selected concept is
shown in a belief network display window 814. Belief network display window 814 shows
the selected concept, the subconcepts which have been defined as relating to the selected
concept and the percentage values associated with each relationship. The user may add a
subconcept by selecting a subconcept add button 815. The user may edit a subconcept by
selecting the subconcept in belief network display window 814 and then selecting a
subconcept edit button 816. A subconcept remove button 818 permits the user to delete a
subconcept from the belief network.

Selecting subconcept add button 815 causes a subconcept add window 820
to appear. Subconcept add window 820 includes a subconcept name box 822 for entering
the name of a new subconcept. A slider control 824 permits the user to select the
percentage value that defines the probability of the selected concept appearing given that
the newly selected subconcept appears. A keyword list 826 lists the keywords and key
phrases which indicate discussion of the subconcept. The user adds to the list by selecting
a keyword add button 828 which causes display of a dialog box (not shown) for entering
the new keyword or key phrase. The user deletes a keyword or key phrase by selecting it
and then selecting a keyword delete button 830. Once the user has finished defining the
new subconcept, he or she confirms the definition by selecting an OK button 832.
Selection of a cancel button 834 dismisses subconcept add window 820 without affecting
the belief network contents or structure. Selection of subconcept edit button 816 causes
display of a window similar to subconcept add window 820 permitting redefinition of the
selected subconcept.

By selecting whether a background learning checkbox 836 has been
selected, the user may enable or disable the operation of profile updating stage 624. A
web autofetch check box 838 permits the user to select whether or not to enable an
automatic web search process. When this web search process is enabled, whenever a
particular keyword or key phrase is found frequently near where a defined concept is
determined to be discussed, a web search tool such as AltaVista is employed to look on
the World Wide Web for documents containing the keyword or key phrase. A threshold
slider control 840 is provided to enable the user to set a threshold relevance level for this
autofetching process.

Figs. 9A-9B depict a user interface for providing feedback in accordance
with one embodiment of the present invention. User 504 may select any text and call up a
first feedback window 902. The text may or may not have been previously identified by
the annotation system as relevant. In first feedback window 902 shown in Fig. 9A, user
504 may indicate the concept to which the selected text is relevant. First feedback window
902 may not be necessary when adjusting the relevance level for a keyword or key phrase
that is already a part of belief network 700. After the user selects a concept in first
feedback window 902, a second feedback window 904 is displayed for selecting the degree
of relevance. Second feedback window 904 in Fig. 9B provides three choices for level of
relevance: good, medium (not sure), and bad. Alternatively, a slider control could be used
to set the level of relevance. If the selected text is not already a keyword or key phrase in
belief network 700, a new subconcept is added along with the associated new keyword or
key phrase. If the selected text is already a keyword or key phrase, as above, probability
values within belief system 700 are modified appropriately in response to this user
feedback.
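
The effect of the three feedback choices on an existing probability value can be sketched as a small adjustment function. The description does not say by how much a value changes, so the fixed step size and the clamping to the range 0 to 1 are assumptions of this sketch, as is the function name.

```python
def apply_feedback(probability, feedback, step=0.1):
    """Adjust a link probability after user feedback.

    feedback: "good" raises the value, "bad" lowers it, and "medium"
    (not sure) leaves it unchanged.

    ASSUMPTIONS: a fixed additive step and clamping to [0, 1]; the
    description only says values are modified appropriately.
    """
    if feedback == "good":
        probability += step
    elif feedback == "bad":
        probability -= step
    elif feedback != "medium":
        raise ValueError("unknown feedback: " + repr(feedback))
    return min(1.0, max(0.0, probability))
```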

Fig. 10 depicts a portion of an HTML document 1000 processed in
accordance with one embodiment of the present invention. A sentence including relevant
text is preceded by an <RH.ANOH.S ...> tag 1002 and followed by an
</RH.ANOH.S> tag 1004. The use of these tags facilitates the annotation mode where
complete sentences are highlighted. The <RH.ANOH.S ...> tag 1002 includes a number
indicating which relevant sentence is tagged in order of appearance in the document.
Relevant text within a so-tagged relevant sentence is preceded by an <RH.ANOH ...>
tag 1006 and followed by an </RH.ANOH> tag 1008. The <RH.ANOH ...> tag 1006
includes the names of the concept and subconcept to which the annotated text is
relevant, an identifier indicating which relevant sentence the text is in and a number which
identifies which annotation this is in sequence for a particular concept. An HTML
browser that has not been modified to interpret the special annotation tags provided by the
present invention will ignore them and display the document without annotations.
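
Emitting the two kinds of annotation tags can be sketched as simple string formatting. The tag names RH.ANOH and RH.ANOH.S come from the description; the attribute names CONCEPT, SUBCONCEPT, SENTENCE and NUMBER, and their quoting, are assumptions of this sketch, since the description lists the information carried by the tags without fixing their syntax.

```python
from html import escape


def tag_phrase(text, concept, subconcept, sentence, number):
    """Wrap annotated text in an <RH.ANOH ...> element.

    ASSUMPTION: the attribute names and quoting style are illustrative.
    """
    return (
        '<RH.ANOH CONCEPT="%s" SUBCONCEPT="%s" SENTENCE="%d" NUMBER=%d>%s</RH.ANOH>'
        % (escape(concept), escape(subconcept), sentence, number, text)
    )


def tag_sentence(body, sentence):
    """Wrap a relevant sentence in an <RH.ANOH.S ...> element.

    ASSUMPTION: the sentence number is carried in a NUMBER attribute.
    """
    return "<RH.ANOH.S NUMBER=%d>%s</RH.ANOH.S>" % (sentence, body)
```

Because browsers generally skip elements they do not recognize, output of this form still displays as ordinary text in an unmodified browser, which is the behavior the paragraph above relies on.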



Thumbnail Image Display

Referring again to Figs. 2A-2D, an elongated thumbnail image 214 of many
pages, or all of document 502 is presented in second viewing area 215. Document 502
will typically be a multipage document with a section being displayed in first viewing area
202. Elongated thumbnail image 214 provides a convenient view of the basic document
structure. The annotations incorporated into the document are visible within elongated
thumbnail image 214. Within elongated thumbnail image 214, an emphasized area 214A
shows a reduced view of the document section currently displayed in first viewing area 202
with the reduction ratio preferably being user-configurable. Thus, if first viewing area
202 changes in size because of a change of window size, emphasized area 214A will also
change in size accordingly. The greater the viewing area allocated to elongated thumbnail
image 214 and emphasized area 214A, the more detail is visible. With very small
allocated viewing areas, only sections of the document may be distinguishable. As the
allocated area increases, individual lines and eventually individual words become
distinguishable. In Figs. 2A-2D the user-configured ratio is approximately 5:1.
Emphasized area 214A may be understood to be a lens or a viewing window over
the part of elongated thumbnail image 214 corresponding to the document section
displayed in first viewing area 202. User 504 may scroll through document 502 by sliding
emphasized area 214A up and down. As emphasized area 214A slides, the section of
document 502 displayed in first viewing area 202 will also shift. User 504 may also scroll
conventionally using scroll bar 204 or arrow keys and emphasized area 214A will slide up
or down as appropriate in response.
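
Keeping the emphasized area and the main view in step reduces to a pair of conversions by the reduction ratio. The sketch below assumes a single uniform ratio, pixel units, and hypothetical function names; the 5:1 default follows the approximate ratio mentioned for Figs. 2A-2D.

```python
def emphasized_area(scroll_top, viewport_height, ratio=5.0):
    """Top and height of the emphasized area, in thumbnail pixels.

    scroll_top and viewport_height describe the first viewing area in
    document pixels; ratio is the document-to-thumbnail reduction ratio.
    """
    return scroll_top / ratio, viewport_height / ratio


def scroll_for_emphasized_top(emphasized_top, ratio=5.0):
    """Document scroll offset matching a dragged emphasized area."""
    return emphasized_top * ratio
```

Scrolling the document calls the first function to reposition the emphasized area; dragging the emphasized area calls the second to scroll the document, so either input keeps the two views aligned.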

In Figs. 2A-2C elongated thumbnail image 214 displays each page of
document 502 as being displayed at the same reduced scale. The present invention also
contemplates other modes of scaling elongated thumbnail image 214. For example, one
may display emphasized area 214A at a scale similar to that shown in Figs. 2A-2C and use
a variable scale for the rest of elongated thumbnail image 214. Text far away from
emphasized area 214A would be displayed at a highly reduced scale and the degree of
magnification would increase with nearness to emphasized area 214A.
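
A variable scale of this kind can be sketched as a function of distance from the emphasized area. The description gives no formula, so the hyperbolic falloff, the constants, and the function name are all assumptions chosen only to show magnification rising with nearness.

```python
def local_scale(distance, focus_scale=0.2, min_scale=0.02, falloff=2000.0):
    """Scale factor for text `distance` pixels from the emphasized area.

    Returns focus_scale at distance 0 and decays toward min_scale as the
    distance grows.

    ASSUMPTION: the hyperbolic falloff and every constant are illustrative.
    """
    distance = abs(distance)
    return min_scale + (focus_scale - min_scale) * falloff / (falloff + distance)
```

The default `focus_scale` of 0.2 corresponds to the approximate 5:1 ratio, so the emphasized area itself is drawn at the same scale as in the uniform mode.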

Because the annotations appear in elongated thumbnail image 214, it is
very easy to find relevant text anywhere in document 502. Furthermore, elongated
thumbnail image 214 provides a highly useful way of keeping track of one's position
within a lengthy document.

Software Implementation

In a preferred embodiment, software to implement the present invention is
written in the Java language. Preferably, the software forms a part of a stand-alone
browser program written in the Java language. Alternatively, the code may be in the form
of a so-called "plug-in" operating with a Java-equipped web browser used to browse
HTML documents including the special annotation tags explained above.

In the foregoing specification, the invention has been described with
reference to specific exemplary embodiments thereof. For example, any probabilistic
inference method may be substituted for a Bayesian belief network. It will, however, be
evident that various modifications and changes may be made thereunto without departing
from the broader spirit and scope of the invention as set forth in the appended claims and
their full scope of equivalents.

CLAIMS:




1. An automatic adaptive document help system for annotating an electronically
stored document, the system comprising:

means for storing a user-specified concept of interest; 

means for locating discussion of said concept of interest within the 
electronically stored document; and 

means for displaying said electronic document with visual indications of said 
identified locations. 



2. A computer-implemented method for annotating an electronically stored 
document comprising the steps of: 

accepting user input indicating a user-specified concept of interest; 
analyzing said electronic document to identify locations of discussion of said 
user-specified concept of interest; and 

displaying said electronic document with visual indications of said identified 
locations. 

3. The method of claim 2 wherein said analyzing step comprises exploiting a 
probabilistic inference method to identify said locations. 

4. The method of claim 3 wherein said probabilistic inference method comprises a 
Bayesian belief network.



5. The method of claim 4 further comprising the step of: 

accepting user input defining a structure of said Bayesian belief network. 

6. The method of claim 4 or claim 5 further comprising the step of
modifying said Bayesian belief network in accordance with content of
previously visited electronic documents.

7. The method of any one of the claims 2 to 6 wherein said displaying step
comprises the substep of: 

highlighting sections of said document surrounding said locations. 

8. The method of any one of the claims 2 to 6 wherein said displaying step
comprises the substep of:

displaying a balloon pointing to a user-selected one of said locations, said
balloon identifying said user-specified concept to which text in said user-selected one
of said locations is relevant.

9. The method of any one of the claims 2 to 6 wherein said displaying step 
comprises the substep of: 

displaying marginal notation identifying said locations. 

10. The method of claim 4 or claim 5 further comprising the steps of:

accepting user input indicating a degree of relation between said locations and
said concept of interest; and

modifying said Bayesian belief network responsive to said degree of relation. 

11. The method of any one of the claims 2 to 10 further comprising the step of
displaying a level of relevance of said document to said concept of interest. 

12. A computer-implemented method for displaying a multipage document 
comprising the steps of: 

displaying an elongated thumbnail image of a multi-page document in a first 
viewing area of a display; 

displaying a section of said multi-page document in a second viewing area of 
said display in legible form; 

emphasizing an area of said thumbnail image corresponding to said section 
displayed in said second viewing area;

accepting user input controlling sliding said emphasized area through said multi-page
document; and

scrolling said displayed section in said second viewing area responsive to said 
sliding so that said emphasized area continues to correspond to said displayed section. 

13. The method of claim 12 further comprising the steps of:

accepting user input indicating user-specific concepts of interest;

analyzing said multi-page document to identify locations of discussion of said
user-specific concepts of interest;

marking said locations in both said thumbnail image and in said displayed
section in said second viewing area.

14. A computer program product for annotating an electronically stored document
comprising:

code for accepting user input indicating a user-specified concept of interest;

code for analyzing said electronic document to identify locations of discussion
of said user-specified concepts of interest;

code for displaying said electronic document with visual indications of said
identified locations; and

a computer-readable storage medium for storing the codes.

15. The product of claim 14 wherein said analyzing code comprises code for
exploiting a probabilistic inference method to identify said locations.

16. The product of claim 15 wherein said probabilistic inference method comprises
a Bayesian belief network. 



17. The product of claim 16 further comprising code for:

accepting user input defining a structure of said Bayesian belief network.

18. The product of claim 16 or claim 17 further comprising code for modifying said
Bayesian belief network in accordance with content of said electronic document.

19. The product of claim 18 wherein said modifying code comprises code for 
updating said Bayesian belief network in accordance with proximity of keywords to 
said identified locations. 

20. The product of any one of the claims 14 to 19 wherein said displaying code 
comprises code for highlighting said locations. 

21. The product of any one of the claims 14 to 19 wherein said displaying code
comprises code for highlighting sections of said document surrounding said locations. 

22. The product of any one of the claims 14 to 19 wherein said displaying code 
comprises code for displaying balloons pointing to said locations. 

23. The product of any one of the claims 14 to 19 wherein said displaying code 
comprises code for displaying marginal notations identifying said locations. 

24. The product of claim 16 or claim 17 further comprising:

code for accepting user input indicating a degree of relation between said 
locations and said concepts of interest; and 

code for modifying said Bayesian belief network responsive to said degree of 
relation. 

25. The product of any one of the claims 14 to 24 further comprising code for 
displaying a level of relevance of said document to said concept of interest. 

26. A computer program product for displaying a multipage document comprising:

code for displaying an elongated thumbnail image of a multi-page document in a
first viewing area of a display;

code for displaying a section of said multi-page document in a second viewing
area of said display in legible form;

code for emphasizing an area of said thumbnail image corresponding to said
section displayed in said second viewing area;

code for accepting user input controlling sliding of said emphasized area 
through said thumbnail image; 

code for scrolling said displayed section so that said displayed section continues 
to correspond to said emphasized area; and 

a computer-readable storage medium for storing the codes. 

27. The computer program product of claim 26 further comprising: 

code for accepting user input indicating user-specific concepts of interest;

code for analyzing said multi-page document to identify locations of discussion
of said user-specific concepts of interest; and

code for marking said locations in both said thumbnail image and in said
displayed section in said second viewing area.

28. A computer system comprising: 
a processor; and 

a computer-readable storage medium storing code to be executed by said 
processor, said code comprising:

code for accepting user input indicating user-specific concepts of interest; 
code for analyzing an electronic document to identify locations of discussion of 
said user-specific concepts of interest; and 

code for displaying said electronic document with visual indications of said 
identified locations. 

29. A computer program product for displaying a multipage document comprising: 

code for displaying an elongated thumbnail image of a multi-page document in a 
first viewing area of a display;

code for displaying a section of said multi-page document in a second viewing
area of said display in legible form;

code for emphasizing an area of said thumbnail image corresponding to said 
section displayed in said second viewing area; 

code for accepting user input indicating user-specific concepts of interest; 

code for analyzing said multi-page document to identify locations of discussion 
of said user-specific concepts of interest; and 

code for marking said locations in both said thumbnail image and in said 
displayed section in said second viewing area. 

30. A computer system comprising: 
a processor; and 

a computer-readable storage medium storing code to be executed by said 
processor, said code comprising: 

means for accepting user input indicating user-specific concepts of interest; 

means for analyzing an electronic document to identify locations of discussion 
of said user-specific concepts of interest; and 

means for displaying said electronic document with visual indications of said 
identified locations. 

31. A computer system for displaying a multipage document comprising: 

means for displaying an elongated thumbnail image of a multi-page document in 
a first viewing area of a display; 

means for displaying a section of said multi-page document in a second viewing 
area of said display in legible form; 

means for emphasizing an area of said thumbnail image corresponding to said 
section displayed in said second viewing area; 

means for accepting user input indicating user-specific concepts of interest; 

means for analyzing said multi-page document to identify locations of 
discussion of said user-specific concepts of interest; and

means for marking said locations in both said thumbnail image and in said
displayed section in said second viewing area.